= VNR/Shared Dictionary =
This page explains how to use the Shared Dictionary dialog in VNR.

== Introduction ==
The shared dictionary improves machine translation of game text and window text. It does not affect user-contributed subtitles. Terms defined in the dictionary are replaced in the text before or after machine translation.

The dictionary is a double-edged sword, so take care that it does not hurt your translations.

== Basic usage ==
Press "New" to add an entry, then edit its "pattern" and "translation" columns. VNR will then replace the "pattern" with the "translation" in new subtitles and window texts. Editable cells are shown in green; read-only cells are shown in blue.

Internet access is required to add or modify entries, because all changes are saved online. When you are offline, the dictionary is read-only.

VNR updates the dictionary automatically every few days. You can also press "Refresh" to run the update manually.

== Macros ==
Macros define reusable regular expression patterns. Some important macros are listed here.

=== bos ===
Beginning of a sentence. Example: 俺 anchored with bos matches nothing in お前の物は俺の物、俺の物も俺の物。.

=== eos ===
End of a sentence. Example: 物 anchored with eos matches only the last 物 in お前の物は俺の物、俺の物も俺の物。.

=== boc ===
Beginning of a clause. Almost the same as bos, except that the punctuation marks it recognizes include the comma. Example: 俺 anchored with boc matches only the second 俺 in お前の物は俺の物、俺の物も俺の物。.

=== eoc ===
End of a clause. Almost the same as eos, except that the punctuation marks it recognizes include the comma. Example: 物 anchored with eoc matches both the second and the last 物 in お前の物は俺の物、俺の物も俺の物。.

== Detailed help ==
=== Term type ===
The "type" column determines when an entry's replacement is applied.
* Translation: translates text from the input language (Japanese) to the output language (your language)
* Input: repairs text in the input language (Japanese) before machine translation
* Output: repairs text in the output language (your language) after machine translation
* Name: Japanese names, such as 「釘宮」
* Suffix: titles that follow Japanese names, such as 「ちゃん」
* Game: repairs the text extracted from the game
* TTS: repairs the text before TTS is applied
* OCR: repairs the text after OCR is applied
* Macro: reusable regular expression patterns

=== Language matching ===
A term in the dictionary is ignored unless its language matches your language in Preferences/Account. A term is applied if any one of the following is true:
# The term's language is "All languages".
# The term's language is the same as yours.
# The term's language is English, and your language is not an Asian language.
# The term's language is either Simplified or Traditional Chinese, and your language is Chinese.

=== Term priority ===
Two terms can interact with each other, for example when they have the same or similar patterns. When that happens, the term with the longer pattern is evaluated first.

=== Permissions ===
* Everyone can add new entries to the dictionary.
* You can edit or delete your own entries. (Editable cells are shown in green; read-only cells are shown in blue.)
* You cannot directly edit others' entries.
* Guest users can no longer modify anonymous entries that were added more than 7 days ago.
* Non-guest users can improve or disable others' entries. (Disabled entries are struck out.)
* To disable someone else's entry, you MUST provide a short reason explaining what is wrong with the translation. This helps us revert malicious edits and remove potential scams.

=== Series-specific terms ===
Series-specific terms are applied only to the selected game series.
Games in the same series, such as the first game and its FD (fan disc), share series-specific terms. You can find the game series in Game Information. For example, if a term is an ambiguous character name or a rule that removes garbage text, you might want to mark it as series-specific.

=== Character and title terms ===
You can mark your translations as character names. If your language is a European language, a character term behaves like a Japanese term. If your language is an Asian language, it behaves like an Escaped term. The difference is that VNR will also try to translate titles around character names.

Titles are specified using title terms. A title term's pattern must match the text immediately following a Japanese name. For example, if you define a character term that translates 「ヴァイス」 to "Weiss" and a title term that translates 「将軍」 to "-general", then VNR will try to translate 「ヴァイス将軍」 to "Weiss-general".

Note: Regular expressions have certain limitations when describing names and titles. When regular expression is enabled for these terms, "|" (the "or" operator) and captures such as "\1" cannot be used, or they will mess up your translation.

=== Improving furigana ===
VNR can use MeCab and MS Japanese IME to convert Japanese text to furigana. Non-regular-expression terms for TTS and character names are also used to improve the furigana. Their patterns are used to split the Japanese text. Additionally, the translations of TTS terms replace the furigana produced by MeCab or MS Japanese IME.

=== Hentai terms ===
Certain Japanese words have different meanings in H-scenes. For example, 「出る」 usually means ejaculation rather than "to go out" in that context. Also, some translations might be offensive to certain groups of people. You can mark these terms as Hentai so that other users can choose whether to enable them.

=== Private terms ===
Private terms are visible only to the creators of the translations. Other people will not see the content of the translation.
For example, if you want to translate the hero's name to your real name and the heroine's name to your girlfriend's name, you can mark the terms as "Private" so that other people will not see the real names.

=== Regular expression ===
A regular expression is a string that describes a pattern of strings. Because the meaning of a word is usually context-sensitive, regular expressions allow us to describe the context of a term in the dictionary. But please DO NOT use regular expressions before you understand their syntax: regular expressions are slower, and a bad regular expression could ruin translations for all users.

Here are some examples of useful regular expressions:
* かわいい+ -- matches "かわいい", "かわいいい", "かわいいいい", ...
* [わあ]たし -- matches both "わたし" and "あたし"
* (?<=あたし)は -- matches "は" only after "あたし" (i.e. excludes other "は")
* かわいい(?=です) -- matches "かわいい" only before "です"
* かわい(?!い) -- matches "かわい" but excludes "かわいい"
* です(?=[。、？！…」]) -- matches "です" only at the end of a sentence
* (?<=(?:^|[。、？！…「]))あたし -- matches "あたし" only at the beginning of a sentence

Regular expressions can also help VNR identify character names in the game text. For example, in 「まじこいS-1」, the original game text looks like: 「Scenario」CharaName. We want to change it to: CharaName「Scenario」. You can achieve this by adding a term as follows:
* Target: Original text
* Language: Japanese
* Game-specific: YES -- i.e. only apply to this game
* Regular expression: YES
* Pattern: (「.*」)(.*) -- match the scenario as \1 and the character name as \2
* Replacement: \2\1 -- swap \1 and \2

As another example, in 「媚肉の香り」, the original game text looks like: ＜CharaName＞：Scenario. To change it to: CharaName「Scenario」, you can add a term as follows:
* Target: Original text
* Language: Japanese
* Game-specific: YES -- i.e. only apply to this game
* Regular expression: YES
* Pattern: ＜(.*?)＞：(.*) -- match the character name as \1 and the scenario as \2
* Replacement: \1「\2」 -- wrap \2 with 「」

=== Repetition elimination ===
Text Settings has options for removing repeated text, but their granularity is so coarse that you cannot control exactly how repetitions are removed. They might remove more than you want. If the default repetition filters do not work well, you can use a regular expression in the Shared Dictionary for fine-grained finite repetition removal.

For example, in 「天色＊アイルノーツ」, the original game text looks like "CharaCharaChara「Dialog」", where the character name at the beginning repeats exactly 3 times. We want to change it to "Chara「Dialog」". The default repetition filter removes repeats not only in Chara but also in Dialog, which is not what we want. So it is better to add the following term to rewrite the source text:
* Pattern: ^(.+)\1\1
* Replacement: \1

In plain English, this means "replace whatever repeats 3 times at the beginning of the sentence with a single copy of the repeated text". Here, ".+" means "whatever". "(.+)" captures that text as "\1". "(.+)\1" means the text occurs twice. "(.+)\1\1" means the text occurs three times. The "^" sign restricts the match to the beginning of a sentence.

The built-in finite repetition filter in Text Settings is implemented with similar regular expressions. Its logic is similar to the following:
* Pattern: (.+)\1+
* Replacement: \1

In plain English, this removes whatever is repeated more than once. It might sound dangerous. However, "AAABAAAB", for example, becomes "AAAB" instead of "ABAB". This is because the regular expression tries to match the longest possible repeated text, which is "AAAB" rather than "A".

== Other notes and tips ==
* Modifications to the dictionary are saved online after VNR has been idle for 5 seconds. While VNR is saving your changes, the green ring around the blue button spins.
* You can edit the "comment" column to explain where the translation applies.
* Make sure that the pattern text is relatively unique and not too short, or it might degrade the translation quality.
* Pressing the "Refresh" button also refreshes window text translations. It does not currently refresh game text translations.
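
Before adding a regular expression term to the dictionary, it can help to try the pattern outside VNR. The following is a minimal sketch in Python that replays the substitutions from the Regular expression and Repetition elimination sections. It assumes that Python's re module behaves like VNR's regular expression engine for these particular patterns, which this page does not confirm. The helper apply_term and the sample strings (such as the name 美月) are made up for illustration.

```python
import re


def apply_term(pattern: str, replacement: str, text: str) -> str:
    """Hypothetical helper: replace every match of pattern in text."""
    return re.sub(pattern, replacement, text)


# Swap "「Scenario」CharaName" into "CharaName「Scenario」".
swapped = apply_term(r"(「.*」)(.*)", r"\2\1", "「おはよう」美月")

# Rewrite "＜CharaName＞：Scenario" as "CharaName「Scenario」".
wrapped = apply_term(r"＜(.*?)＞：(.*)", r"\1「\2」", "＜美月＞：おはよう")

# A name repeated exactly three times, followed by dialog that also repeats.
line = "美月美月美月「あああ」"

# Anchored term: only the triple repeat at the start is collapsed.
anchored = apply_term(r"^(.+)\1\1", r"\1", line)

# Unanchored filter: the repetition inside the dialog is collapsed too.
unanchored = apply_term(r"(.+)\1+", r"\1", line)

# Greedy matching picks the longest repeated unit ("AAAB", not "A").
greedy = apply_term(r"(.+)\1+", r"\1", "AAABAAAB")

# Negative lookahead: "かわい" is not matched when another "い" follows.
no_match = re.findall(r"かわい(?!い)", "かわいい")
one_match = re.findall(r"かわい(?!い)", "かわいです")

# Lookbehind: only the "は" directly after "あたし" is matched.
positions = [m.start() for m in re.finditer(r"(?<=あたし)は", "あたしは私は")]

print(swapped)     # 美月「おはよう」
print(wrapped)     # 美月「おはよう」
print(anchored)    # 美月「あああ」
print(unanchored)  # 美月「あ」
print(greedy)      # AAAB
```

The contrast between anchored and unanchored shows why the anchored pattern is the safer choice when only the character name repeats. Note that Python's re module rejects variable-width lookbehinds, so the sentence-start example that uses (?<=(?:^|...)) is left out of this sketch.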